feat: implement explicit create_cache API #138

Merged
seanbrar merged 14 commits into main from feature/explicit-caching-api
Mar 5, 2026

@seanbrar seanbrar commented Mar 4, 2026

Summary

Replaces the implicit Config(enable_caching=True, ttl_seconds=...) mechanism with an explicit create_cache() → CacheHandle → Options(cache=handle) flow. This is a breaking change to the caching API.

Why

The implicit approach coupled cache lifecycle to generation calls — caches were created as a side effect of run(), with no way to control timing, share across calls, or handle errors independently. The explicit API separates cache creation from usage, giving callers control over when uploads and cache warming happen.

Design decisions

  • CacheHandle is opaque and frozen. It carries name, model, provider, and expires_at but callers don't inspect internals — they just pass it to Options(cache=handle).
  • Eager validation at the boundary. create_cache_impl validates all inputs (types, provider capability, tool structure) before any I/O. build_plan validates handle compatibility (provider/model match, expiration, conflicting options) before any network calls. This ordering is load-bearing — see commit history for the cost of getting it wrong.
  • Single-flight deduplication. Concurrent create_cache() calls with identical content share one upload + one API call via singleflight_cached. The registry is keyed by content hash, scoped by provider and API key.
  • Config no longer owns caching config. enable_caching and ttl_seconds are removed from Config. TTL is now per-cache, passed to create_cache(ttl_seconds=...).
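
A minimal sketch of the first two decisions, under stated assumptions: the field names come from the description above, but the field types, the `check_handle` helper, and the stand-in `ConfigurationError` class are hypothetical and are not the code in src/pollux. It shows a frozen handle plus the kind of pure, pre-network compatibility check that build_plan is described as performing.

```python
from __future__ import annotations

import time
from dataclasses import dataclass


class ConfigurationError(ValueError):
    """Stand-in for the library's configuration error type (assumption)."""


@dataclass(frozen=True)
class CacheHandle:
    # Field names follow the PR description; the types are assumptions.
    name: str
    model: str
    provider: str
    expires_at: float  # assumed here to be a Unix timestamp in seconds


def check_handle(
    handle: CacheHandle,
    *,
    provider: str,
    model: str,
    conflicting: dict[str, object],
    now: float | None = None,
) -> None:
    """Hypothetical helper: reject an unusable handle before any network call."""
    now = time.time() if now is None else now
    if handle.provider != provider or handle.model != model:
        raise ConfigurationError("cache handle was created for a different provider/model")
    if handle.expires_at <= now:
        raise ConfigurationError("cache handle has expired")
    # Options such as system_instruction or tools are baked into the cache,
    # so passing them again alongside the handle is treated as a conflict.
    set_fields = sorted(k for k, v in conflicting.items() if v is not None)
    if set_fields:
        raise ConfigurationError(
            "cannot combine a cache handle with: " + ", ".join(set_fields)
        )
```

Because the check takes `now` as a parameter and performs no I/O, it can be exercised in tests without a provider or a clock patch.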

What changed

| Area | Change |
|---|---|
| `src/pollux/__init__.py` | New create_cache() public API, CacheHandle export |
| `src/pollux/cache.py` | CacheHandle, CacheRegistry, create_cache_impl, content-hash keying, file upload dedup |
| `src/pollux/config.py` | Removed enable_caching, ttl_seconds fields |
| `src/pollux/plan.py` | Validates cache handles (expiration, provider/model match, option conflicts) |
| `src/pollux/execute.py` | Simplified; no longer creates caches implicitly |
| `src/pollux/options.py` | Options.cache field accepts CacheHandle |
| `src/pollux/providers/` | create_cache() method on provider interface; wrap_provider_error passes through ConfigurationError |
| `tests/test_pipeline.py` | ~550 lines of cache boundary tests |
| `docs/`, `cookbook/` | Updated for new API |

Related issue

None

Test plan

  • just check passes (lint + typecheck + 176 tests)
  • Cache boundary tests cover: handle creation, registry hit/miss, expired handles, provider/model mismatch, conflicting options (system_instruction, tools, tool_choice, sources), file upload deduplication, single-flight coordination, TTL validation, type validation for tools and system_instruction, unserializable tools, unsupported providers

Notes

  • The validation-before-I/O ordering in create_cache_impl is documented with a scaling note: if the parameter surface grows, a validated CacheSpec dataclass would keep the boundary manageable.
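
For illustration, one way the CacheSpec idea from the note could look. This is a sketch only: no such class exists in this PR, and the field set and the specific TTL rule (a positive integer) are assumptions. The point is that every check runs in `__post_init__`, so an invalid spec can never reach the upload step.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any


class ConfigurationError(ValueError):
    """Stand-in for the library's configuration error type (assumption)."""


@dataclass(frozen=True)
class CacheSpec:
    """Hypothetical validated bundle of create_cache() inputs."""

    sources: tuple[Any, ...]
    ttl_seconds: int
    system_instruction: str | None = None
    tools: tuple[Any, ...] = ()

    def __post_init__(self) -> None:
        # All checks are pure: nothing here touches the network or disk.
        if isinstance(self.ttl_seconds, bool) or not isinstance(self.ttl_seconds, int):
            raise ConfigurationError("ttl_seconds must be an integer")
        if self.ttl_seconds <= 0:
            raise ConfigurationError("ttl_seconds must be positive")
        if self.system_instruction is not None and not isinstance(
            self.system_instruction, str
        ):
            raise ConfigurationError("system_instruction must be a string")
        for i, tool in enumerate(self.tools):
            if not isinstance(tool, dict):
                raise ConfigurationError(f"tools[{i}] must be a dict")
```

With this shape, adding a parameter means adding one field and one check in a single place, rather than threading another argument through the validation code.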

This commit refactors the persistent context caching mechanism in Pollux, replacing the implicit `enable_caching=True` flag in `Config` with an explicit `create_cache` API. This decouples cache upload/warm-up from text generation, allowing for stricter validation (e.g., rejecting `system_instruction` or `tools` usage alongside a `CacheHandle` to match Gemini API constraints) and more predictable caching behavior.

codecov bot commented Mar 4, 2026

Codecov Report

❌ Patch coverage is 85.27132% with 19 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/pollux/cache.py | 89.02% | 9 Missing ⚠️ |
| src/pollux/providers/gemini.py | 25.00% | 6 Missing ⚠️ |
| src/pollux/options.py | 66.66% | 2 Missing ⚠️ |
| src/pollux/providers/_errors.py | 50.00% | 1 Missing ⚠️ |
| src/pollux/providers/openai.py | 0.00% | 1 Missing ⚠️ |


Sean Brar added 13 commits March 4, 2026 02:34
Addresses P1 and P2 findings from code review:
- Raises ConfigurationError if options.cache is used alongside sources.
- Validates tools as dictionaries before creation in Gemini provider.

Updates caching.md so it no longer passes sources alongside a cache handle; that combination is now explicitly rejected with a ConfigurationError.

Move create_cache implementation from __init__.py into cache.py
(create_cache_impl, _resolve_file_parts, module-level _registry),
leaving __init__.create_cache as a thin provider-lifecycle wrapper.

Shift cache-handle conflict validation (provider/model mismatch,
system_instruction/tools/tool_choice/sources conflicts) from
execute_plan() into build_plan() so errors surface at planning time
before any network I/O. Retain a single runtime persistent_cache
capability check in execute_plan() as a safety net for hand-built
handles.

The new _resolve_file_parts memoizes uploads by (file_path, mime_type)
to avoid duplicate uploads for repeated file sources within a single
create_cache() call.
…alls

Move _resolve_file_parts() into the single-flight work function inside
get_or_create_cache() so concurrent callers for the same cache key share
both uploads and cache creation.  Previously uploads ran before the
single-flight boundary, causing duplicate uploads when two coroutines
raced past the registry miss.

get_or_create_cache() now accepts raw_parts (unresolved placeholders)
and resolves them inside _work().  Add
test_cache_single_flight_deduplicates_file_uploads to verify
upload_calls==1 and cache_calls==1 under concurrency.
Include api_key in compute_cache_key() so different credentials for the
same provider/model produce distinct cache entries.  Prevents silent
cross-account handle reuse in multi-tenant or multi-key scenarios.

Also fix the create_cache() docstring example which referenced an
undefined `config` variable (now uses `cfg` consistently).
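
To make the keying change concrete, here is a hedged sketch of a content-hash key that is scoped by credential. The function name `cache_key`, its parameters, and the choice to fingerprint the API key before hashing are assumptions for illustration; the real compute_cache_key() may differ.

```python
import hashlib
import json
from typing import Optional, Sequence


def cache_key(
    provider: str,
    model: str,
    api_key: str,
    parts: Sequence[str],
    system_instruction: Optional[str] = None,
) -> str:
    """Derive a stable key from content plus provider, model, and credential."""
    # Fingerprint the credential so the raw key is never part of the payload.
    key_fingerprint = hashlib.sha256(api_key.encode("utf-8")).hexdigest()
    payload = json.dumps(
        {
            "provider": provider,
            "model": model,
            "api_key": key_fingerprint,
            "parts": list(parts),
            "system_instruction": system_instruction,
        },
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

Two callers with identical content but different API keys get different keys, so a handle created under one account is never returned to another.
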
- Validate tool items are dicts in create_cache_impl before uploads,
  preventing wasted file uploads on invalid input
- Validate system_instruction type at the API boundary, converting a
  raw TypeError into a ConfigurationError with hint
- Pass through ConfigurationError in wrap_provider_error instead of
  re-wrapping as CacheError
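
The pass-through behavior in the last bullet might look roughly like this. The exception classes are stand-ins and the one-argument signature is an assumption; the actual wrap_provider_error in src/pollux/providers/_errors.py is not shown in this PR description.

```python
class ConfigurationError(Exception):
    """Stand-in: the caller supplied invalid input."""


class CacheError(Exception):
    """Stand-in: the provider failed while creating or using a cache."""


def wrap_provider_error(exc: Exception) -> Exception:
    """Return the error to raise for a failure seen during a provider call."""
    if isinstance(exc, ConfigurationError):
        # Caller mistakes keep their type and message instead of being
        # disguised as provider failures.
        return exc
    wrapped = CacheError(f"provider call failed: {exc}")
    wrapped.__cause__ = exc  # keep the original error in the traceback chain
    return wrapped
```
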
An expired CacheHandle passed via Options(cache=handle) was silently
accepted, leading to a cryptic provider error. Now caught eagerly with
a clear ConfigurationError before any network I/O.
@seanbrar seanbrar merged commit 86d2039 into main Mar 5, 2026
10 checks passed
@seanbrar seanbrar deleted the feature/explicit-caching-api branch March 5, 2026 05:26